Abstract

The angiosperm genus Silene exhibits some of the most extreme and rapid divergence ever identified in mitochondrial 
genome architecture and nucleotide substitution rates. These patterns have been considered mitochondrial specific based on 
the absence of correlated changes in the small number of available nuclear and plastid gene sequences. To better assess the 
relationship between mitochondrial and plastid evolution, we sequenced the plastid genomes from four Silene species with 
fully sequenced mitochondrial genomes. We found that two species with fast-evolving mitochondrial genomes, S. noctiflora
and S. conica, also exhibit accelerated rates of sequence and structural evolution in their plastid genomes. The nature of
these changes, however, is markedly different from those in the mitochondrial genome. For example, in contrast to the 
mitochondrial pattern, which appears to be genome wide and mutationally driven, the plastid substitution rate accelerations 
are restricted to a subset of genes and preferentially affect nonsynonymous sites, indicating that altered selection pressures 
are acting on specific plastid-encoded functions in these species. Indeed, some plastid genes in S. noctiflora and S. conica
show strong evidence of positive selection. In contrast, two species with more slowly evolving mitochondrial genomes,
S. latifolia and S. vulgaris, have correspondingly low rates of nucleotide substitution in plastid genes as well as a plastid
genome structure that has remained essentially unchanged since the origin of angiosperms. These results raise the possibility 
that common evolutionary forces could be shaping the extreme but distinct patterns of divergence in both organelle 
genomes within this genus. 
Introduction

Plants and other photosynthetic eukaryotes share the 
distinction of having plastids, an endosymbiotically derived 
organelle that coexists with mitochondria in the cytoplasm 
(Gould et al. 2008; Kim and Archibald 2009). There are clear 
parallels in the long-term evolution of the genomes of 
these two organelles. For example, both have experienced 
massive gene loss (Adams and Palmer 2003; Timmis et al. 
2004), which appears to be a universal pattern in obligately 
intracellular symbionts (Andersson and Kurland 1998; 
Moran and Wernegreen 2000). Organelle gene loss has
generally been associated with transfer of genetic control
to the nucleus, so most of the genes required for organelle 
function are located in the nuclear genome, including virtu- 
ally all of those responsible for the maintenance of organ- 
ellar DNA (Day and Madesis 2007; Sloan and Taylor 2012). 
Interestingly, many of the plant genes involved in DNA 
replication and repair in one organelle genome have related 
paralogs that function in the other (Zaegel et al. 2006; 
Shedge et al. 2007; Cappadocia et al. 2010). Moreover, 
the products of many nuclear genes are targeted to both 
organelles, including a disproportionate fraction of genes 
associated with DNA synthesis and processing (Carrie et al.
2009). Therefore, the evolution of DNA replication and 
repair machinery in organelles involves a complex history 
of gene transfer, co-option, duplication, retargeting, and 
replacement (reviewed in Sloan and Taylor 2012). 

Despite sharing components of their DNA replication and 
repair machinery, mitochondrial and plastid genomes differ 
greatly in their structural organization and evolution. For 
example, seed plant plastid genomes are gene-dense and 
exhibit a high degree of syntenic conservation (Raubeson 
and Jansen 2005). In contrast, seed plant mitochondrial 
genomes contain an abundance of noncoding sequence 
and experience rapid rates of rearrangement among and 
even within species (Mower et al. forthcoming). Mitochon- 
drial and plastid genomes also exhibit different rates of 
nucleotide substitution, which are believed to reflect under- 
lying differences in mutation rate. Rates of synonymous sub- 
stitutions are typically two to four times faster in plastid DNA 
than mitochondrial DNA in seed plants (Wolfe et al. 1987; 
Palmer and Herbon 1988; Drouin et al. 2008). However, 
a handful of seed plant lineages exhibit dramatic increases 
in mitochondrial substitution rates, reaching levels that are 
more typical of fast-evolving animal mitochondrial genomes 
(Cho et al. 2004; Parkinson et al. 2005; Bakker et al. 2006; 
Mower et al. 2007; Sloan et al. 2008, 2009; Ran et al. 2010).

Observed cases of rate acceleration in plant mitochon- 
drial DNA are often characterized as mitochondrial-specific 
phenomena because sequenced nuclear and plastid genes 
show little or no correlated increase in substitution rate (e.g., 
Cho et al. 2004; Mower et al. 2007; Sloan et al. 2008). 
Nevertheless, limited evidence suggests that some of 
these dramatic changes in mitochondrial rate may not be 
entirely independent of evolution in the plastid genome. 
For example, the Geraniaceae, which has experienced 
a series of extreme changes in mitochondrial substitution 
rate (Parkinson et al. 2005), also exhibits abnormally high 
rates of structural evolution in the plastid genome and ac- 
celerated substitution rates in a subset of plastid genes 
(Chumley et al. 2006; Guisinger et al. 2008, 2011; Blazier
et al. 2011). Likewise, gnetophytes exhibit elevated rates 
of evolution in both plastid and mitochondrial genomes 
(Mower et al. 2007; McCoy et al. 2008; Wu et al. 2009). 
However, comparisons based on complete genomes from 
both the mitochondria and the plastids are not available 
in these cases to assess the potential evolutionary parallels 
between the organelle genomes. 

We recently reported the complete mitochondrial ge- 
nome sequences of four Silene species with highly divergent 
mitochondrial substitution rates (Sloan et al. 2012). Two of 
these species (S. noctiflora and S. conica) exhibit nearly
100-fold increases in synonymous substitution rates,
whereas the other two (S. latifolia and S. vulgaris) have
low rates that are more typical of other angiosperm
mitochondrial genomes. The accelerated rates in S. noctiflora
and S. conica are associated with major expansions in
mitochondrial genome size, resulting in the largest known
mitochondrial genomes, as well as numerous other changes in
mitochondrial gene and genome architecture.

To assess the relationship, if any, between mitochondrial 
and plastid genome evolution in Silene, we sequenced and 
analyzed the complete plastid genomes from these same 
four Silene species. The species with fast-evolving
mitochondrial genomes (S. noctiflora and S. conica) do not show
evidence of comparable genome-wide increases in plastid 
synonymous substitution rates. However, they do exhibit 
substantial rate accelerations in a subset of plastid genes, 
particularly at nonsynonymous sites, suggesting that altered 
selection pressures are acting on specific plastid pathways or 
functions in these species. In addition, the S. noctiflora and
S. conica plastid genomes have experienced rapid structural
evolution. In contrast, the S. latifolia and S. vulgaris plastid
genomes are highly conserved relative to other angio- 
sperms. These results provide an example of recent and 
correlated accelerations in mitochondrial and plastid ge- 
nome evolution among closely related species, but the 
specific patterns of sequence and structural change differ 
between the two organelle genomes in many respects. 
We discuss the possibility that shared forces are acting, 
either directly or indirectly, on both mitochondrial and plastid 
genomes in Silene. 

Materials and Methods 

Source Material and Plastid DNA Extraction 

For each of four Silene species (S. latifolia Poir., S. vulgaris
[Moench] Garcke, S. noctiflora L., and S. conica L.),
approximately 200 g of fresh tissue was collected from multiple
individuals from a single maternal family. The maternal fam- 
ilies and collection methods correspond to those previously 
described for mitochondrial genome sequencing (Sloan, 
Alverson, et al. 2010; Sloan et al. 2012). Intact chloroplasts 
were isolated using a combination of differential centrifuga- 
tion and separation on a sucrose step gradient (Palmer 
1986; Jansen et al. 2005). Chloroplasts were then lysed, 
and DNA was purified by phenol:chloroform extraction.
These preparations yielded between 4 and 20 µg of DNA
per species. The purity of plastid DNA was confirmed by 
restriction digestion. 

Roche 454 and Illumina Sequencing

For each plastid DNA sample, shotgun libraries were con- 
structed with multiplex identifier (MID) tags following stan- 
dard protocols for sequencing on a Roche 454 GS-FLX 
platform with Titanium reagents. MID-tagged libraries were 
sequenced as part of a larger pooled sample with each of 
the four species constituting the equivalent of 2.5% of a full 
454 plate. All 454 library construction and sequencing were 
performed at the Genomics Core Facility in the University of
Virginia's Department of Biology. 

Multiplex barcoded libraries were also prepared for 
paired-end sequencing on an Illumina GAII sequencing
platform as described previously (Sloan et al. 2012). For
S. noctiflora, plastid DNA was amplified with GenomiPhi
V2 (GE Healthcare, Piscataway, NJ) to produce sufficient
starting material for Illumina library construction. All other
Illumina libraries (and all 454 libraries) were generated
without whole-genome amplification. The barcoded libraries
were sequenced as part of a larger pooled sample in a single
Illumina lane on a 2 × 85 bp paired-end run with each
species representing 8% of the pool. Illumina sequencing
was performed at the Biomolecular Research Facility in
the University of Virginia's School of Medicine.

Genome Assembly and Annotation 

Shotgun 454 sequencing produced between 3.7 and 
7.1 Mb of sequence data for each species. These reads 
were assembled with Roche's GS de novo Assembler v2.3 
("Newbler") using default settings. Initial assembly pro- 
duced complete or nearly complete plastid genome sequen- 
ces. Sequencing coverage in single-copy regions for each of 
the four species ranged from 21 to 46×. As expected,
roughly twice those coverage levels were obtained for 
the inverted repeat (IR). The assemblies for each species con- 
tained as many as three gaps, but these generally reflected 
uncertainty regarding the length of long homopolymer (i.e., 
single-nucleotide repeat) regions. These regions were
combined and then corrected with Illumina data (see below) to
produce finished genomes.
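
The coverage figures above follow from simple arithmetic, which can be sketched as below. This is an illustration only: the function name and the example yield are hypothetical, and the sketch assumes that every read is plastid-derived and that the two IR copies collapse onto a single assembled copy, which is why the IR depth comes out at about twice the single-copy depth.

```python
def expected_depth(total_bases, genome_size):
    """Approximate read depth in single-copy regions and in a collapsed IR.

    genome_size counts both IR copies. Reads from either IR copy pile up on
    the one assembled copy, so its depth is about twice the single-copy depth.
    Assumption (hypothetical): all sequenced bases come from the plastid genome.
    """
    single_copy = total_bases / genome_size
    return single_copy, 2 * single_copy


# Hypothetical yield of 5.0 Mb against a genome of roughly 150 kb.
sc_depth, ir_depth = expected_depth(5_000_000, 150_000)
print(f"single-copy ~{sc_depth:.0f}x, IR ~{ir_depth:.0f}x")
```

Real depths will be lower than this estimate wherever some reads derive from nuclear or mitochondrial contamination.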

Roche 454 data are known to have high insertion and 
deletion error rates associated with long homopolymer 
regions. To correct errors in the 454 assembly, paired-end 
Illumina reads were mapped to the genome using SOAP
v2.20 (Li et al. 2009) as described previously (Sloan et al.
2012). After quality trimming and removal of multiplex
barcode sequences, the Illumina run produced between 40 and
259 Mb of sequence with an average read length between 60
and 65 bp for each species. This data set provided deep
coverage for the entirety of all four genomes with an average
read depth between 297 and 1,400×. The Illumina mapping
results were used to identify and correct between 50 and 96
sequencing errors per genome, the vast majority of which 
were associated with homopolymer lengths. 

Protein, transfer RNA (tRNA), and ribosomal RNA (rRNA) 
gene content in each of the finished genomes was anno- 
tated using DOGMA (Wyman et al. 2004). The annotated 
genome sequences were deposited in GenBank (accessions 
JF715054-JF715057). 

Analysis of Genomic Inversions and Indels 

To reconstruct the history of large inversions in Silene plastid 
genomes, gene order and orientations in each genome were
compared with the inferred ancestral state for angiosperms
(Raubeson and Jansen 2005) using GRIMM v2.0.1 (Tesler 
2002). In addition, all four Silene plastid genomes were 
aligned with the outgroup Spinacia oleracea using MAUVE 
v2.3.1 (Darling et al. 2010). 

To identify and quantify the number of indels in each plas- 
tid genome, syntenic blocks of sequence for all four Silene 
species and the outgroup Spinacia oleracea were aligned 
using MUSCLE v3.7 (Edgar 2004). Intergenic regions 
containing inversion breakpoints were not included in this 
analysis. Large indels (>100 bp) were identified by manual
inspection of the sequence alignments. In many cases, the 
size, number, and polarity of smaller indel events were 
ambiguous because multiple indels often overlapped in 
structurally variable regions. Therefore, to estimate the
relative frequency of smaller indels (<100 bp) in each species,
we restricted our focus to the subset of events that are 
unique to a single species within the aligned data set and 
show no overlap with structural variants in the other four 
species. A custom Perl script was used to identify all indels 
meeting these criteria. 
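
The filtering criteria described above can be sketched in Python as follows. This is not the authors' Perl script; the function names and the toy alignment are hypothetical, and the sketch assumes that a gap run private to one species is a deletion, and that a gap run shared identically by every other sequence (including the outgroup) marks an insertion in the remaining species.

```python
def gap_runs(seq):
    """Return half-open (start, end) intervals of '-' runs in an aligned sequence."""
    runs, start = [], None
    for i, ch in enumerate(seq):
        if ch == "-" and start is None:
            start = i
        elif ch != "-" and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(seq)))
    return runs


def unique_small_indels(alignment, species, max_len=100):
    """Find indels shorter than max_len that are private to `species`.

    alignment: dict of name -> aligned sequence (equal lengths, '-' for gaps).
    Events overlapping a gap in any other sequence are discarded.
    """
    others = [name for name in alignment if name != species]
    if not others:
        return [], []
    runs = {name: gap_runs(seq) for name, seq in alignment.items()}

    def overlaps(name, start, end):
        return any(s < end and start < e for s, e in runs[name])

    # Deletion: gap run in the focal species that no other sequence overlaps.
    deletions = [
        (s, e) for s, e in runs[species]
        if e - s < max_len and not any(overlaps(o, s, e) for o in others)
    ]
    # Insertion: identical gap run in every other sequence, none in the focal one.
    insertions = [
        (s, e) for s, e in runs[others[0]]
        if e - s < max_len
        and all((s, e) in runs[o] for o in others)
        and not overlaps(species, s, e)
    ]
    return deletions, insertions


# Toy alignment (hypothetical sequences, for illustration only).
toy = {
    "Spinacia":   "ACGTACGTAC--GTACGT",
    "latifolia":  "ACGTACGTAC--GTACGT",
    "vulgaris":   "ACG-ACGTAC--GTACGT",
    "conica":     "ACGTACGTAC--GTAC-T",
    "noctiflora": "ACGTACGTACTTGTACGT",
}
for name in ("latifolia", "vulgaris", "conica", "noctiflora"):
    print(name, unique_small_indels(toy, name))
```

Requiring identical gap coordinates in all other sequences is deliberately strict: it drops the ambiguous, overlapping events that the text describes as uninterpretable.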

Phylogenetic Analysis and Substitution Rate Variation 

To assess the phylogenetic relationships among the four Si- 
lene species, nucleotide sequences from all Silene protein 
genes and introns were aligned with the corresponding se- 
quences from the closest available outgroup, Spinacia oler- 
acea, as well as Arabidopsis thaliana (for protein-coding 
sequences only). Alignments were performed using 
MUSCLE v3.7 (Edgar 2004) and adjusted manually. Phyloge- 
netic analyses were performed with RAxML v7.0.4 on 
three different concatenated data sets: 1) all protein genes 
except accD, clpP, ycf1, and ycf2 (which were excluded
because of extreme sequence and/or structural divergence
in S. noctiflora and S. conica), 2) all protein genes in the
photosynthesis-related atp, pet, ndh, psa, and psb com- 
plexes, and 3) all introns. RAxML analyses were performed 
with the following parameters: -f d, -b 1, -p 1, -#1,000, 
and -m GTRGAMMA. 

The relative rates of sequence divergence in the four Si- 
lene genomes (and the outgroups Spinacia and Arabidopsis) 
were analyzed using both codon- and nucleotide-based 
models of evolution in PAML v4.4 (Yang 2007) as described
previously (Sloan et al. 2009; Sloan, Alverson, et al. 2010). 
Because the phylogenetic relationships among the four 
Silene species are not confidently resolved, all PAML 
analyses implemented a constrained topology with the four 
Silene species radiating from a single polytomy. 
Protein-coding sequences were analyzed with codon-based 
models to quantify the rates of synonymous and nonsynon- 
ymous substitution, whereas RNA genes and intronic 
sequences were analyzed with nucleotide-based models. 
Analyses were performed on the following data sets: 1) 
a concatenation of all protein genes except accD (see 
Table 1

Summary of Silene Plastid Genomes

                               S. latifolia   S. vulgaris   S. noctiflora   S. conica
Genome size (bp)               151,736        151,583       151,639         147,208
IR (bp)                        25,906         26,008        29,891          26,858
Large single-copy region (bp)  82,704         82,258        79,475          80,129
Small single-copy region (bp)  17,220         17,309        12,382          13,363
G + C content (%)              36.43          36.25         36.51           36.12
Protein genes (a)              77             77            77              77
tRNA genes (a)                 30             30            30              30
rRNA genes (a)                 4              4             4               4
Introns (a)                    20             20            16              16
RNA editing sites (b)          25             26            24              24

(a) Gene and intron counts exclude putative pseudogenes and duplicate copies in the IR.

(b) Editing site counts are from supplementary table S1 (Supplementary Material online) and include predicted sites that have not been confirmed by cDNA sequencing (see
Materials and Methods).



below); 2) separate concatenations of each of the following 
protein gene sets: atp, pet, ndh, psa, psb, rpl, rpo, and rps; 
3) each of the following individual protein genes: ccsA,
cemA, clpP, matK, rbcL, ycf1, ycf2, ycf3, and ycf4; 4) a
concatenation of all rRNA genes; and 5) a concatenation of
all introns. The accD gene is too structurally divergent in
S. noctiflora and S. conica to produce a useful alignment
that includes both species. However, a large portion of accD
from each of these species can be separately aligned against
the remaining species in the analysis. Therefore, two
separate analyses of accD sequence divergence were performed,
one involving S. noctiflora and one involving S. conica.

To test for evidence of positive selection acting on
individual genes or sets of genes, all loci with estimated dN/dS
ratios greater than one in any Silene species were reanalyzed
with the dN/dS ratio constrained to a value of one for that
species. Likelihood ratio tests were performed to compare
the constrained and unconstrained analyses and determine
whether the estimated dN/dS ratios significantly exceed
one (Yang 1998). Because we performed a total of 72 rate 
analyses in Silene protein genes (18 genes or gene sets 
for each of four Silene species), we applied a Bonferroni 
correction factor of 72 to account for multiple comparisons. 
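
A minimal sketch of this testing procedure is given below. It assumes that the test statistic (twice the log-likelihood difference) is compared against a chi-square distribution with one degree of freedom, because the constrained and unconstrained models differ by a single parameter; the text does not state the reference distribution explicitly. The function names and log-likelihood values are hypothetical.

```python
import math


def lrt_pvalue(lnl_unconstrained, lnl_constrained):
    """Likelihood ratio test p-value under a chi-square with 1 df (assumed).

    The models differ by one parameter (dN/dS free versus fixed at 1).
    For 1 df, the chi-square survival function equals erfc(sqrt(x / 2)).
    """
    statistic = max(0.0, 2.0 * (lnl_unconstrained - lnl_constrained))
    return math.erfc(math.sqrt(statistic / 2.0))


def bonferroni(p_value, n_tests=72):
    """Multiply the raw p-value by the number of tests, capped at 1."""
    return min(1.0, p_value * n_tests)


# Hypothetical log-likelihoods, for illustration only.
raw_p = lrt_pvalue(-1234.5, -1242.0)
print(raw_p, bonferroni(raw_p))
```

With 72 comparisons, a raw p-value must fall below roughly 0.0007 to remain under 0.05 after correction.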

RNA Editing 

In land plants, mitochondrial and plastid messenger RNA 
transcripts undergo systematic conversion of cytidines to 
uridines (C-to-U editing), restoring conserved codons 
(Knoop 2011). RNA editing sites were previously identified
by cDNA sequencing for a subset of plastid genes in all four 
of the Silene species analyzed in this study (Sloan, 
MacQueen, et al. 2010). To predict editing sites in other 
plastid genes, we aligned Silene genes against all protein- 
coding sequences from A. thaliana, Nicotiana tabacum, 
and Zea mays that are known to undergo RNA editing. 
Editing data for these three species were obtained from 
REDIdb (Picardi et al. 2007) and other published sources
(Tillich et al. 2005; Chateigner-Boutin and Small 2007).
Any site that is edited in one or more of these outgroups 
was predicted to be edited in Silene species that have 
a C at the corresponding genomic position (supplementary 
table S1, Supplementary Material online). A number of edit- 
ing sites appear to have been lost in one or more Silene spe- 
cies as a result of C-to-T substitutions at the genomic level. 
For any site that was predicted to vary in its editing status 
among the four Silene species, cDNA sequencing was per- 
formed as described previously (Sloan, MacQueen, et al. 
2010) to confirm editing in at least one species. The results 
of cDNA sequencing confirmed editing in all cases except for 
rps14 (nucleotide position 80). This site was predicted to be 
edited in S. latifolia, S. vulgaris, and S. conica but to have
been lost by a genomic C-to-T substitution in S. noctiflora.
However, cDNA sequencing in S. latifolia found no evidence
of editing. Therefore, this site was excluded from the counts 
shown in table 1 and supplementary table S1 (Supplemen- 
tary Material online). 
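
The prediction rule used here (a site edited in an outgroup is predicted to be edited wherever Silene retains a genomic C) can be sketched as follows. The function name, argument layout, and toy sequences are hypothetical; the sketch assumes a gapped alignment of one reference gene against one Silene gene.

```python
def predict_editing_sites(ref_aligned, ref_edited_positions, query_aligned):
    """Return 1-based query positions predicted to undergo C-to-U editing.

    ref_aligned, query_aligned: gapped sequences of equal length taken from
    the same alignment. ref_edited_positions: 1-based, ungapped coordinates
    of known editing sites in the reference gene.
    """
    edited = set(ref_edited_positions)
    predicted = []
    ref_pos = query_pos = 0
    for ref_base, query_base in zip(ref_aligned, query_aligned):
        if ref_base != "-":
            ref_pos += 1
        if query_base != "-":
            query_pos += 1
        # Predict editing only where the query still carries a genomic C.
        if ref_base != "-" and ref_pos in edited and query_base.upper() == "C":
            predicted.append(query_pos)
    return predicted


# Toy example: reference sites 4 and 8 are edited (hypothetical data).
print(predict_editing_sites("ATGCCA-TCA", [4, 8], "ATGCTAGTCA"))  # [4, 9]
print(predict_editing_sites("ATGCCA-TCA", [4, 8], "ATGTTA-TCA"))  # [8]
```

In the second call the query carries a T at the first site, which corresponds to the loss of an editing site through a genomic C-to-T substitution.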

Results 

Gene Content in Silene Plastid Genomes 

All four Silene plastid genomes are typical in size relative to 
other angiosperms and exhibit a classic circular genome 
map with a pair of large IRs separating two single-copy re- 
gions (fig. 1, supplementary fig. S1, Supplementary Material 
online and table 1). The four genomes share a gene com- 
plement that encodes 77 proteins, 30 tRNAs, and 4 rRNAs. 
Genes coding for the translation initiation factor A (infA)
and the ribosomal protein subunit L23 (rpl23) appear to 
be present only as pseudogenes in the genomes of all four 
species and are not included in the totals above. These two 
genes have been lost independently in multiple angiosperm 
lineages, including other species within the Caryophyllales 
(Zurawski and Clegg 1987; Millen et al. 2001; Funk et al. 
2007; Logacheva et al. 2008). The infA gene has been 
Fig. 1. — Plastid genome map for Silene latifolia. Boxes inside and outside the circle correspond to genes on the clockwise and anticlockwise
strand, respectively. The inner circle depicts GC content. The positions of the IR are labeled on the inner circle and noted with thicker black lines on the
outer circle. All differences >100 bp in IR boundary positions among the four sequenced Silene plastid genomes are labeled on the outer circle.
Asterisks indicate genes that have lost introns in S. noctiflora and/or S. conica. Maps of all four Silene plastid genomes are provided as supplementary
material (supplementary fig. S1, Supplementary Material online). This figure was generated with OGDraw v1.2 (Lohse et al. 2007).



subject to repeated functional transfers to the nucleus 
(Millen et al. 2001), whereas there is evidence that rpl23 
has been functionally replaced by its cytosolic counterpart 
in other species (Bubunenko et al. 1994). In addition to 
the functional loss of infA and rpl23, it is possible that some
annotated genes in the Silene plastid genomes are pseudogenes.
For example, as reported previously (Sloan et al.
2009), the intron-encoded open reading frame (ORF) matK
contains an internal frameshift indel in S. conica. The gene
encoding the RNA polymerase α subunit (rpoA) also contains
a frameshift indel approximately 200 bp upstream
of the normal stop codon position in three of the four Silene
species. Silene latifolia and S. noctiflora share a 7 bp deletion
in rpoA, whereas S. vulgaris has a 10 bp deletion in the same



298 Genome Biol. Evol. 4(3):294-306. doi:10.1093/gbe/evs006 Advance Access publication January 12, 2012 



Recent Acceleration of Plastid Sequence and Structural Evolution



GBE 



[Figure 2 image omitted. Alignment rows: Silene noctiflora, Silene conica, Silene vulgaris, Silene latifolia, Spinacia oleracea. Labeled breakpoints: psbM-trnD, trnT-psbD, accD-psaI, trnE-trnT, psaA-ycf3, psaI-ycf4, psbE-petL.]

Fig. 2. — Structural alignments of Silene and Spinacia plastid genomes. The coloring identifies collinear sequence blocks shared by all five
genomes. Bars drawn below the black line indicate sequences found in inverted orientation. The height of each bar reflects sequence similarity. The 
eight inversion breakpoints identified by GRIMM are labeled at the bottom. Only one copy of the IR is shown for each genome, and the orientation of 
the small single-copy region was reversed relative to its conventional presentation to minimize complexities associated with changes in the IR 
boundaries. This figure was generated with MAUVE v2.3.1 (Darling et al. 2010). 



region. In both matK and rpoA, frameshift indels have oc- 
curred in homopolymer regions and have introduced prema- 
ture stop codons. Finally, multiple genes are highly divergent 
in sequence and/or structure in S. conica and S. noctiflora
(see below) and could be pseudogenes. Most notably, the
entire 3' half of the accD gene is missing in S. noctiflora.

Rapid Structural Evolution in the Plastid Genomes of 
S. noctiflora and S. conica 

The S. latifolia and S. vulgaris plastid genomes show nearly
perfect syntenic conservation with Spinacia (fig. 2) and 
other angiosperms including Amborella trichopoda (data 
not shown), suggesting that these two Silene species have 
maintained the ancestral angiosperm genomic structure 
(Raubeson and Jansen 2005). In contrast, the two species 
with fast-evolving mitochondrial genomes (S. noctiflora
and S. conica) have experienced numerous changes in
plastid genome structure in just the few million years since the
divergence of these four Silene species. These changes 
include multiple inversions, intron losses, large indels, and 
shifts in the IR boundaries. 

The rearranged gene order in the S. noctiflora plastid
genome suggests that it has experienced four inversions
involving six breakpoints found within or between the
following gene pairs: psbM-trnD, accD-psaI, psbB-clpP,
petL-psbE, psbD-trnT, trnT-trnE (fig. 2). The S. conica
genome appears to have experienced a single inversion with
a pair of breakpoints (psaA-ycf3 and psaI-ycf4) that are
distinct from any of those involved in the S. noctiflora
rearrangements (fig. 2). At least some of the inversions
are likely the result of recombination between short IRs, 
as has been observed in other angiosperms (Knox et al. 
1993; Haberle et al. 2008). All four Silene species have a pair
of divergent IRs (ca. 170 bp and 80% sequence identity) that
coincide with the breakpoints for the S. conica inversion.
Outside of Silene, this sequence is widely conserved in seed 
plant plastid genomes but found only as a single copy. In 
addition, S. noctiflora has a unique pair of IRs (154 bp,
99% sequence identity) corresponding to the petL-psbE 
and psbD-trnT breakpoints. However, any repeats that 
may have been associated with other inversion events in 
S. noctiflora are not readily identifiable. Interestingly, the
S. conica inversion interrupts a genomic fragment that
was previously sequenced and analyzed in a number of
species within the Sileneae including S. conica (Erixon
and Oxelman 2008a). This earlier study did not detect
the inversion found in our analysis of the S. conica plastid
genome, which could indicate that it is polymorphic within 
the species. However, the 2008 study was based on poly- 
merase chain reaction (PCR) and Sanger sequencing of in- 
dividual fragments, so artifacts involving PCR-mediated 
recombination (Alverson et al. 2011) might also explain 
the earlier finding of a noninverted genome conformation 
in S. conica.
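
The relationship between inversions and breakpoints discussed above can be illustrated with a simple breakpoint count on signed gene orders. This is not the GRIMM algorithm, which searches for a minimal inversion scenario; it is only a sketch with hypothetical function names and a hypothetical eight-block genome. Because a single inversion can create at most two new breakpoints, b breakpoints imply at least b/2 inversions, so the six breakpoints in S. noctiflora require at least three inversions, which is compatible with the four inferred.

```python
def count_breakpoints(order):
    """Count breakpoints in a signed gene order relative to 1, 2, ..., n.

    order: list of signed integers, where the sign encodes strand. The order
    is framed with 0 and n + 1, and an adjacency (a, b) counts as conserved
    only when b - a == 1.
    """
    framed = [0] + list(order) + [len(order) + 1]
    return sum(1 for a, b in zip(framed, framed[1:]) if b - a != 1)


def invert(order, i, j):
    """Invert the segment order[i:j]: reverse it and flip every sign."""
    return order[:i] + [-g for g in reversed(order[i:j])] + order[j:]


ancestral = list(range(1, 9))            # hypothetical eight-block genome
one_inversion = invert(ancestral, 2, 5)  # [1, 2, -5, -4, -3, 6, 7, 8]
two_inversions = invert(one_inversion, 4, 7)

print(count_breakpoints(ancestral))       # 0
print(count_breakpoints(one_inversion))   # 2
print(count_breakpoints(two_inversions))  # 4
```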

The S. latifolia and S. vulgaris plastid genomes share an
identical complement of 19 group II introns (including 
the trans-splicing first intron of rps12; Koller et al. 1987) 
as well as a single group I intron in the trnL-UAA gene. 
Fig. 3. — Synonymous (dS) and nonsynonymous divergence (dN) in
Silene mitochondrial and plastid genomes as estimated with PAML.
Plastid data are based on an analysis of all protein genes except accD.
Mitochondrial data are from table 1 in Sloan et al. (2012). The solid lines
represent best fit trend lines for the plastid (black) and mitochondrial
(gray) data.

The S. noctiflora and S. conica genomes each lack four of
these introns. Both S. noctiflora and S. conica have lost
the rpoC1 intron as well as both introns in the fast-evolving
clpP gene (Erixon and Oxelman 2008b). In addition,
S. noctiflora has lost the rpl16 intron and S. conica has lost the atpF
intron. All five of these introns have been lost independently
in other angiosperm lineages (Downie et al. 1996; Campagna
and Downie 1998; Jansen et al. 2007). Like other members of
the core Caryophyllales, all four Silene species lack the rpl2 
intron found in other plant lineages (Downie et al. 1991; 
Logacheva et al. 2008). In every case, missing introns have 
been precisely excised at their normal splicing boundaries. 

In addition to the deletions associated with intron loss, 
the S. noctiflora and S. conica plastid genomes have also
experienced a total of 10 and 11 large indels of >100 bp
in size, respectively. Ten of these indels were found in
protein-coding sequences (all in the highly divergent
accD, ycf1, and ycf2 genes), whereas the remaining 11
are in noncoding sequences (10 in intergenic regions and
one in the trnL-UAA intron). In contrast, no indel greater
than 100 bp was found in either S. latifolia or S. vulgaris.
The S. noctiflora and S. conica plastid genomes also have
a higher frequency of small indels. Alignments of all four
Silene species with the outgroup Spinacia oleracea identified
a total of 18, 46, 107, and 151 unique nonoverlapping indels
of <100 bp in size for S. latifolia, S. vulgaris, S. conica, and
S. noctiflora, respectively.

The structures of the S. noctiflora and S. conica plastid
genomes have also been altered by shifts in the boundaries
between their IRs and single-copy regions. Although the
precise boundaries of the IR in angiosperms are subject to
frequent shifts (Goulding et al. 1996), they are generally
found within the rps19 and ycf1 genes. These boundary
positions appear to be the ancestral state for most
angiosperm lineages, including the genus Silene based on
their presence in both S. latifolia and S. vulgaris (fig. 1,
supplementary fig. S1, Supplementary Material online)
as well as in Spinacia oleracea, the closest outgroup with
a sequenced plastid genome (Schmitz-Linneweber et al.
2001). Silene latifolia and S. vulgaris share identical boundary
positions between the IR and the large single-copy region
and differ only slightly (<100 bp) in the positions of
their boundaries between the IR and the small single-copy
region. In contrast, the IR in S. noctiflora and S. conica has
contracted at the boundary with the large single-copy
region and expanded at the boundary with the small
single-copy region (fig. 1). As a result, the IR in S. noctiflora
and S. conica does not contain any portion of rps19 and
lacks a substantial fraction of rpl2. In addition, the IR
now includes the entirety of ycf1 in both species as well
as rps15 and a portion of ndhH in S. noctiflora. Interestingly,
Fagopyrum esculentum, another member of the Caryophyl- 
lales with a fully sequenced plastid genome, also has an 
expanded IR that contains a full copy of ycf1 and which
is shared with related species in the Polygonaceae and Plum- 
baginaceae (Logacheva et al. 2008, 2009). However, the 
absence of this expansion in other families within the non- 
core Caryophyllales (Logacheva et al. 2009) suggests that it 
evolved multiple times independently, including at least 
once within the genus Silene and another time in a common 
ancestor of the Polygonaceae and Plumbaginaceae. 

Although the directions of the inferred IR boundary shifts 
are the same in both S. noctiflora and S. conica, the
magnitudes of the expansions and contractions differ (fig. 1).
Differences in IR boundary positions account for 0.7 kb of
the observed 3 kb difference in IR length between
S. noctiflora and S. conica (table 1). The remaining 2.3 kb difference
in IR length between these species is the result of indels
within the IR, particularly in the coding sequences of ycf1
and ycf2.
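
The arithmetic behind these figures can be checked directly against table 1. The sketch below uses the region lengths reported there; the variable and function names are illustrative, and the 0.7 kb boundary contribution is taken from the text as an approximate value.

```python
# Values transcribed from table 1 (bp): genome size, IR, LSC, SSC.
table1 = {
    "S. latifolia":  (151_736, 25_906, 82_704, 17_220),
    "S. vulgaris":   (151_583, 26_008, 82_258, 17_309),
    "S. noctiflora": (151_639, 29_891, 79_475, 12_382),
    "S. conica":     (147_208, 26_858, 80_129, 13_363),
}


def quadripartite_size(ir, lsc, ssc):
    """Genome size implied by two single-copy regions plus two IR copies."""
    return lsc + ssc + 2 * ir


# Each reported genome size equals LSC + SSC + 2 x IR.
for species, (total, ir, lsc, ssc) in table1.items():
    assert quadripartite_size(ir, lsc, ssc) == total, species

ir_difference = table1["S. noctiflora"][1] - table1["S. conica"][1]
boundary_shift = 700  # approximate contribution of boundary shifts (0.7 kb)
indel_share = ir_difference - boundary_shift
print(ir_difference, indel_share)  # 3033 2333
```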

Elevated and Variable Substitution Rates in the Plastid 
Genomes of S. noctiflora and S. conica 

Silene noctiflora and S. conica exhibit increased substitution
rates in plastid genes. However, the observed rate accelerations
differ in two important respects relative to the dramatic
rate increases in the mitochondrial genomes of
these species. First, the elevated plastid rates are primarily
driven by a disproportionate increase in the frequency of
nonsynonymous substitutions (dN) with only a modest 2- or
3-fold change in synonymous divergence (dS). In contrast,
mitochondrial dN and dS values have each increased by
nearly two orders of magnitude in these species, resulting
in virtually no change in the dN/dS ratio (fig. 3). Second,
whereas the mitochondrial rate accelerations in S. noctiflora
and S. conica appear to be genome-wide phenomena (Sloan
et al. 2012), plastid rates differ substantially among genes
Fig. 4. — Sequence divergence in Silene plastid genes as measured
by the estimated number of substitutions per site in the terminal branch
for each species. The black and white portions of each bar represent the
amount of synonymous and nonsynonymous divergence, respectively.
(A) Genes in major photosynthesis-related complexes. (B) Genes coding
for RNA polymerase subunits and ribosomal proteins. (C) Three other
highly divergent genes (note the 10-fold change in scale). Additional
plots are available as supplementary material (supplementary fig. S2,
Supplementary Material online) for individual protein genes, rRNA
genes, and introns.



(fig. 4 and supplementary fig. S2, Supplementary Material 
online). 
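
The contrast between the two organelles can be illustrated with a small numerical sketch. The baseline divergence values and fold changes below are hypothetical, chosen only to mirror the qualitative pattern described in the text (proportional increases in the mitochondrial genome versus a disproportionate nonsynonymous increase in some plastid genes); they are not estimates from this study.

```python
def dn_ds_ratio(dn: float, ds: float) -> float:
    """Return dN/dS given nonsynonymous (dn) and synonymous (ds) divergence."""
    if ds <= 0:
        raise ValueError("ds must be positive to form a ratio")
    return dn / ds


# Hypothetical baseline divergence (substitutions per site); not from the paper.
baseline_dn, baseline_ds = 0.002, 0.010
baseline_ratio = dn_ds_ratio(baseline_dn, baseline_ds)

# Mitochondrial-like pattern: dN and dS both scaled by the same large factor,
# so the ratio is unchanged.
mito_ratio = dn_ds_ratio(baseline_dn * 100, baseline_ds * 100)

# Plastid-like pattern: dS rises 3-fold while dN rises disproportionately
# (30-fold in this made-up example), so the ratio increases.
plastid_ratio = dn_ds_ratio(baseline_dn * 30, baseline_ds * 3)

print(f"baseline dN/dS:     {baseline_ratio:.2f}")
print(f"mito-like dN/dS:    {mito_ratio:.2f}")
print(f"plastid-like dN/dS: {plastid_ratio:.2f}")
```

Under these assumed values, a uniform rate increase leaves the ratio where it started, whereas a nonsynonymous-biased increase pushes it upward, which is why the ratio rather than the raw rate is informative about selection.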

Plastid genes in five major complexes associated with
photosynthesis show little rate increase in S. noctiflora
and S. conica (fig. 4A), whereas informational protein genes
including RNA polymerase subunits (to some extent) and
ribosomal proteins show larger increases (fig. 4B). Other
plastid genes have experienced even greater rate changes,
including the large ORFs ycf1 and ycf2, which are known to
be essential for cell survival but are otherwise uncharacterized
(Drescher et al. 2000), as well as the protease subunit
clpP (fig. 4C), which was previously found to have highly
accelerated substitution rates in multiple species within
the tribe Sileneae, including S. conica (Erixon and Oxelman
2008b). The accD gene, which is required for fatty acid
biosynthesis, shows some evidence of substitution rate
acceleration (supplementary fig. S2, Supplementary
Material online) and has also undergone rapid structural
evolution, including large deletions in both S. noctiflora
and S. conica.

Phylogenetic Analysis of Silene Plastid DNA 

An analysis of multiple concatenated data sets did not provide
a clear consensus on the phylogenetic relationships
among the four Silene species in this study (supplementary
fig. S3, Supplementary Material online). A concatenated
data set of all plastid protein genes except accD, clpP,
ycf1, and ycf2 supported a sister relationship between
the fast-evolving S. noctiflora and S. conica lineages (supplementary
fig. S3A, Supplementary Material online). However,
this support disappeared when the analysis was restricted to
photosynthesis-related genes (supplementary fig. S3B,
Supplementary Material online), which do not exhibit a history
of major rate accelerations in S. noctiflora and S. conica
(fig. 4A). These genes supported a sister relationship between
S. latifolia and S. conica (supplementary fig. S3B,
Supplementary Material online). Analysis of shared intron
sequences provided weak support for yet another topology
with S. latifolia sister to S. noctiflora (supplementary fig.
S3C, Supplementary Material online). In all three analyses,
internal branch lengths were very short, indicating a rapid
radiation of these four Silene lineages.
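
The degree of disagreement among the three data sets can be made concrete with a short sketch. Four taxa can be divided into two sister pairs in only three ways, and the sister relationships reported above correspond to all three. The data set labels and helper names below are illustrative assumptions, not identifiers from the study.

```python
taxa = ["S. noctiflora", "S. conica", "S. latifolia", "S. vulgaris"]


def split_for_sister_pair(a, b, names):
    """Return the two-pair split implied by taxa a and b being sisters."""
    pair = frozenset([a, b])
    rest = frozenset(n for n in names if n not in pair)
    return frozenset([pair, rest])


def quartet_splits(names):
    """Enumerate every way to divide four taxa into two pairs."""
    first = names[0]
    return {split_for_sister_pair(first, partner, names) for partner in names[1:]}


# Sister pair favored by each analysis, as described in the text.
# The dictionary keys are informal labels chosen for this sketch.
supported = {
    "concatenated protein genes (divergent genes excluded)":
        split_for_sister_pair("S. noctiflora", "S. conica", taxa),
    "photosynthesis-related genes":
        split_for_sister_pair("S. latifolia", "S. conica", taxa),
    "shared introns":
        split_for_sister_pair("S. latifolia", "S. noctiflora", taxa),
}

all_splits = quartet_splits(taxa)
print(len(all_splits), "possible splits;",
      len(set(supported.values())), "distinct splits supported")
```

Each data set therefore favors a different one of the three possible groupings, which is consistent with the very short internal branches noted above.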



Discussion 

Recent and Correlated Changes in Mitochondrial and 
Plastid Genome Evolution 

Table 2
Contrasting Patterns of Mitochondrial versus Plastid Genome Divergence in Silene noctiflora and Silene conica

                                                              Mitochondrial   Plastid
Sequence
  Major genome-wide increase in synonymous substitution rate  Yes             No
  Large increases in dN/dS in a subset of protein genes       No              Yes
  Large decrease in the frequency of RNA editing              Yes             No
Structural
  Genomic expansion                                           Yes             No
  Multichromosomal genome structure                           Yes             No
  Gene duplication                                            Yes             No
  Elevated indel rate                                         Yes             Yes
  Intron losses                                               Only one        Yes
  Inversions                                                  N/A (a)         Yes
  Shifts in IR boundaries                                     N/A             Yes

(a) Rates of genome rearrangement between and even within species are so high in angiosperm mitochondrial genomes that estimating the number or rate of inversions in any
given lineage is not feasible.

Sequencing of the S. noctiflora and S. conica mitochondrial
genomes revealed that they are exceptional, even when
compared with the already complex mitochondrial genomes
of most flowering plants. These mitochondrial genomes exhibit
extreme changes in genome size, structure, and rate of
sequence evolution (Sloan et al. 2012). In this study, we have
shown that the plastid genomes in these species have also
experienced recent and rapid divergence that distinguishes 
them from the plastid genomes of most angiosperms, in- 
cluding other members of the same genus. Although com- 
parisons of complete mitochondrial and plastid genome 
sequences have not been performed in other angiosperm 
species with accelerated mitochondrial substitution rates, 
there is some evidence to suggest that similar correlated in- 
creases in the rate of sequence and/or structural evolution in 
both organelle genomes have occurred in lineages such as 
the Geraniaceae and gnetophytes (Parkinson et al. 2005; 
Chumley et al. 2006; Mower et al. 2007; Guisinger et al. 
2008, 2011; McCoy et al. 2008; Wu et al. 2009; Blazier 
et al. 2011). 

Although these cases constitute relatively few indepen- 
dent data points, they nevertheless raise the possibility of 
a shared mechanism affecting both organelle genomes. 
In angiosperms, the mapping and sequencing of plastid 
genomes have far outpaced progress on mitochondrial 
genomes. As a result, there are numerous angiosperm lin- 
eages, such as the Campanulaceae (including the Lobelia- 
ceae), Fabaceae, Goodeniaceae, Oleaceae, Passifloraceae, 
and Ranunculaceae, that have been identified as having ac- 
celerated and/or rearranged plastid genomes, but for which 
we have little or no mitochondrial data (Jansen et al. 2007, 
2008). Many of these lineages contain plastid genomes that 
are far more divergent and rearranged than those found in 
Silene and, therefore, represent a natural starting point for 
generating additional mitochondrial genome sequences. It 
is unlikely that there is any simple or absolute relationship 
between the patterns of evolution in mitochondrial and 
plastid genomes. For example, note that some of the most 
divergent plastid genomes in the Geraniaceae (Blazier et al. 
2011; Guisinger et al. 2011) occur in genera with only 
moderately accelerated mitochondrial substitution rates 
(Parkinson et al. 2005). Nevertheless, a more comprehensive 
comparison of organelle genomes across angiosperms may
help identify mechanisms that jointly affect mitochondrial
and plastid genome evolution.

The idea that rates of sequence evolution might be cor- 
related between mitochondrial and plastid genomes is not 
new. In fact, there are many factors expected to affect rates 
and patterns of evolution at an organismal level (Ohta 1992;
Whittle and Johnston 2002; Smith and Donoghue 2008).
Therefore, one of the intriguing elements of this study is
not necessarily that the mitochondrial and plastid genomes
are both highly divergent in S. noctiflora and S. conica but
that they have diverged in such different ways (table 2). Our 
findings raise the question of what evolutionary mecha- 
nisms could generate these correlated, yet distinct patterns 
of divergence between the mitochondrial and the plastid 
genomes. There are many potential answers to this question 
(including simple coincidence), but one intriguing possibility 
involves modification of nuclear genes encoding dual- 
targeted protein products. For example, homologs of the 
bacterial recA gene are known to play an important role 
in plant organelle genome stability, and the Arabidopsis ge- 
nome contains three characterized recA homologs with one 
targeted to plastids, one to mitochondria, and one to both 
organelles (Shedge et al. 2007; Rowan et al. 2010). Modi- 
fication of the dual-targeted gene RECA2 (Shedge et al. 
2007) could affect the evolution of both genomes but in po- 
tentially different ways given the possibility that the gene 
product serves different functional roles in the two organ- 
elles or maintains different levels of redundancy with other 
members of the gene family. 

The discovery and history of the bacterial mutS homolog 
MSH1 may also be informative with respect to correlated 
patterns of evolution between mitochondrial and plastid ge- 
nomes. This nuclear locus was originally named CHM (for 
chloroplast mutator) because mutants exhibited a variegated 
leaf phenotype and modifications in plastid morphology 
that could subsequently be inherited maternally (Redei 
1973). Therefore, it was predicted that disruptions of this 
nuclear gene destabilize the plastid genome. Subsequent
work, however, found that the MSH1 gene product is 
targeted to mitochondria, where it regulates recombina- 
tional activity and genome reorganization, and a direct ro- 
le of MSH1 in plastid genome stability became uncertain 
(Martinez-Zapater et al. 1992; Abdelnoor et al. 2003; 
Shedge et al. 2007; Arrieta-Montiel et al. 2009). The docu- 
mented effects of MSH1/CHM on plastids may be partially 
mediated through indirect physiological pathways linking 
these two organelles. Because mitochondria and plastids
maintain a high degree of functional interdependence
(Roussell et al. 1991; Woodson and Chory 2008; Yoshida
and Noguchi 2011), it is possible that perturbation of one
organelle genome will have evolutionary consequences 
for the other. More recent findings indicate that MSH1 is 
also targeted to plastids and may play a direct role in plastid 
genome stability as originally predicted (Xu et al. 2011). 
Therefore, this dual-targeted gene highlights the potential 
for both direct and indirect mechanisms to link the evolution 
of mitochondrial and plastid genomes. MSH1, RECA, and 
other gene families known to be involved in plant organelle 
genome stability (e.g., Zaegel et al. 2006; Cappadocia et al.
2010) represent important candidates for further investigation
in Silene. 

Causes of Substitution Rate Variation Among Plastid 
Genes 

The pattern of mitochondrial substitution rate acceleration
in S. noctiflora and S. conica has been attributed to genome-wide
increases in the mutation rate (Mower et al. 2007;
Sloan et al. 2009, 2012). However, a similar interpretation
is inconsistent with the observed substitution patterns in
the plastid genomes. The magnitude of rate accelerations
in S. noctiflora and S. conica varies markedly across plastid
genes. Some of this variation might be explained by "localized
hypermutation" as has been proposed in cases of gene-specific
rate accelerations in both plastid (Magee et al. 2010)
and mitochondrial (Sloan et al. 2009) genomes. However,
even a model with a diverse range of localized gene-specific
mutation rates could not explain the disproportionate
increases in dN found in many plastid genes in S. noctiflora
and S. conica. Instead, the observed increases in dN/dS suggest
a history of relaxed purifying selection and/or increased
positive selection acting on a subset of plastid genes in
S. noctiflora and S. conica. Increased rates of sequence
and structural evolution might be associated with the process
of functional gene transfer to the nucleus (Magee et al.
2010). In addition, some loci exhibit dN/dS ratios that are significantly
greater than one when averaged across the entire
length of the gene (table 3), strongly suggesting at least
some role for positive selection in the rate accelerations
observed in these species.
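
Table 3 distinguishes estimates that remain significant after Bonferroni correction from those that are significant only at an uncorrected threshold of 0.05. The following minimal sketch shows that bookkeeping. The raw P values and the number of tests are hypothetical, since the uncorrected values are not listed in this section.

```python
def bonferroni(p_values):
    """Multiply each raw P value by the number of tests, capping at 1.0."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw P values for a family of five tests; not from the paper.
raw = [0.0004, 0.012, 0.03, 0.20, 0.65]
adjusted = bonferroni(raw)

alpha = 0.05
labels = []
for p_raw, p_adj in zip(raw, adjusted):
    if p_adj < alpha:
        labels.append("significant after correction")
    elif p_raw < alpha:
        labels.append("significant only before correction")
    else:
        labels.append("not significant")

for p_raw, p_adj, label in zip(raw, adjusted, labels):
    print(f"raw={p_raw:<7} adjusted={p_adj:<6.3f} {label}")
```

In this made-up example, the second and third tests fall into the intermediate class that Table 3 marks with an asterisk.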

Table 3
Positive Selection on Silene Plastid Genes

Gene/Complex          S. noctiflora      S. conica
accD                  2.20               0.98
cemA                  1.21               1.48
clpP                  1.19               1.31
rps (concatenated)    2.23 (0.002)       1.17
ycf1                  1.6*               2.33 (6 x 10^-6)
ycf2                  1.87 (0.02)        1.39*

Note. All genes (or sets of concatenated genes belonging to a single complex)
with estimated dN/dS values greater than 1 are shown. Estimates that are significantly
greater than 1 are shown in bold with Bonferroni-corrected P values in parentheses.
Values that are significant based on an uncorrected P value of 0.05 but not after
Bonferroni correction are marked with an asterisk.

The differences in substitution rate and dN/dS across functional
classes of plastid genes (fig. 4, supplementary fig. S2,
Supplementary Material online) suggest that changes in selection
pressure may be associated with specific biochemical
pathways or functions rather than the entire genome. Inter- 
estingly, the patterns of rate variation among genes in 
S. noctiflora and S. conica exhibit some clear parallels with
the evolution of plastid genomes within the Geraniaceae, 
which have experienced a longer and more extreme history 
of genome rearrangement (Chumley et al. 2006; Guisinger 
et al. 2008, 2011; Blazier et al. 2011). For example, both 
lineages show a high degree of sequence conservation in 
genes directly involved in photosynthesis and greater levels 
of divergence in other genes such as ribosomal proteins. 
Furthermore, the most divergent genes in S. noctiflora
and S. conica, in particular, accD, clpP, ycf1, and ycf2, have
been completely lost from the plastid genome in multiple 
lineages within the Geraniaceae (Guisinger et al. 2008, 
2011). The parallels are not perfect, however. Some of 
the highest levels of divergence in the Geraniaceae are 
found in genes coding for RNA polymerase subunits 
(Guisinger et al. 2008), which show only modest accelera- 
tions in S. noctiflora and S. conica (fig. 4). In addition, one 
clade within the Geraniaceae appears to have lost all 
functional copies of its ndh genes (Blazier et al. 2011), 
but these genes are highly conserved in Silene. 

The evolution of plastid genomes in nonphotosynthetic
angiosperms provides some insight into the patterns of
selection acting on Silene plastid genes. Not surprisingly,
evolution of a nonphotosynthetic lifestyle is generally associated
with plastid genome reduction and gene loss (Wolfe
et al. 1992; Delannoy et al. 2011). Nevertheless, nonphotosynthetic
angiosperms retain a plastid genome, demonstrating
that the functional importance of plastids extends
beyond photosynthetic pathways (Barbrook et al. 2006;
Benning et al. 2006). Most of the genes retained in the plastid
genomes of nonphotosynthetic plants are required for
plastid gene expression. For example, of the 42 functional
genes identified in the highly reduced plastid genome of the
parasitic eudicot Epifagus virginiana, only 4 (accD, clpP, ycf1,
and ycf2) are not involved in plastid gene expression (Wolfe
et al. 1992). The nonphotosynthetic orchid Rhizanthella
gardneri, which has the smallest sequenced plastid genome
of any land plant, has independently converged on a remarkably
similar set of genes (including accD, clpP, ycf1, and ycf2)
(Delannoy et al. 2011).

Strikingly, these are the same four genes that exhibit the
greatest accelerations in the rate of sequence and/or structural
evolution in S. noctiflora and S. conica, suggesting that
there have been significant changes in selection pressures
acting on nonphotosynthetic pathways in plastids in both
Silene species. Although these four genes are widely retained
in land plants (Delannoy et al. 2011), each has been
lost from the plastid genome of some lineages, including
multiple angiosperms (Katayama and Ogihara 1996; Knox
and Palmer 1999; Chumley et al. 2006; Haberle et al.
2008; Guisinger et al. 2011). There is also evidence for positive
selection acting on these genes in other land plant
lineages (Erixon and Oxelman 2008b; Greiner et al.
2008). Knockout experiments in tobacco have shown that
all four are essential (Drescher et al. 2000; Shikanai et al.
2001; Kuroda and Maliga 2003; Kode et al. 2005). The
protein encoded by clpP is a component of a complex multimeric
protease with broad substrate specificity within the
plastid (Peltier et al. 2004; Stanne et al. 2009), whereas accD
codes for a subunit of the acetyl-CoA carboxylase, which is involved
in fatty acid biosynthesis (Kode et al. 2005). Despite
the essential nature of ycf1 and ycf2 (Drescher et al.
2000), the functions of these genes have not yet been identified.
Silene and other lineages with a history of extreme divergence
in accD, clpP, ycf1, and ycf2 may provide an
opportunity to better understand the more general role of
these genes and their related pathways in plants.

Single or Multiple Origins of Accelerated Organelle
Genome Evolution in Silene?

The clear similarities between S. noctiflora and S. conica in
the evolution of both mitochondrial and plastid genomes
raise the obvious question of whether these lineages form
a monophyletic group that experienced shared ancestral
changes associated with the organelle genomes. Although
it is tempting to assume that commonalities between these
species (e.g., shared intron losses in clpP and rpoC1) reflect
common ancestry, phylogenetic analyses in other angiosperms
have shown that such patterns can occur in parallel
across independent evolutionary lineages (e.g., Guisinger
et al. 2011). Therefore, an independent phylogenetic estimate
of the relationships between these Silene lineages is
necessary. However, such analyses have generally failed
to resolve relationships among the major lineages of Silene
subgenus Behenantha, which includes the species examined
in this study (Erixon and Oxelman 2008a; Sloan et al. 2009).
Our analyses based on data from complete plastid genomes
produced similar ambiguities (supplementary fig. S3, Supplementary
Material online). Likewise, an analysis of a small
number of nuclear genes across this genus found that some
loci support monophyly between the S. noctiflora and the
S. conica lineages, whereas others do not (Rautenberg
et al. forthcoming). Therefore, the question of single versus
multiple origins of accelerated organelle genome evolution
in Silene remains unresolved. Efforts are underway to produce
deep transcriptome sequencing coverage of multiple
Silene species. The resulting data set should help disentangle
the phylogenetic relationships within Silene as well as
elucidate the cytonuclear interactions that have shaped
the extreme patterns of organelle genome evolution in this
genus.

Supplementary Material 

Supplementary figures S1-S3 and table S1 are available
at Genome Biology and Evolution online
(http://www.gbe.oxfordjournals.org/).